Results 1 - 20 of 122
1.
Chem Res Toxicol ; 37(4): 580-589, 2024 Apr 15.
Article in English | MEDLINE | ID: mdl-38501392

ABSTRACT

The desirable pharmacological properties and broad range of therapeutic activities of peptides have made them promising drugs compared with small organic molecules and antibody drugs. Nevertheless, toxic effects, such as hemolysis, have hampered the development of such promising drugs. Hence, a reliable computational tool to predict peptide hemolytic toxicity is enormously useful before synthesis and experimental evaluation. Currently, four web servers that predict hemolytic activity using machine learning (ML) algorithms are available; however, they exhibit some limitations, such as the need for a reliable negative set and a limited application domain. Hence, we developed a robust model based on a novel theoretical approach that combines network science and a multiquery similarity searching (MQSS) method. A total of 1152 initial models were constructed from 144 scaffolds generated in a previous report. These were evaluated on external data sets, and the best models were fused and improved. Our best MQSS model I1 outperformed all state-of-the-art ML-based models and was used to characterize the prevalence of hemolytic toxicity on therapeutic peptides. Based on our model's estimation, the number of hemolytic peptides might be 3.9-fold higher than reported.


Subjects
Hemolysis , Peptides , Humans , Amino Acid Sequence , Peptides/pharmacology , Peptides/chemistry , Algorithms , Machine Learning
2.
ACS Omega ; 9(8): 8923-8939, 2024 Feb 27.
Article in English | MEDLINE | ID: mdl-38434903

ABSTRACT

Recent reports have suggested that the susceptibility of cells to SARS-CoV-2 infection can be influenced by various proteins that potentially act as receptors for the virus. To investigate this further, we conducted simulations of viral dynamics using different cellular systems (Vero E6, HeLa, HEK293, and CaLu3) in the presence and absence of drugs (anthelmintic, ARBs, anticoagulant, serine protease inhibitor, antimalarials, and NSAID) that have been shown to impact cellular recognition by the spike protein based on experimental data. Our simulations revealed that the susceptibility of the simulated cell systems to SARS-CoV-2 infection was similar across all tested systems. Notably, CaLu3 cells exhibited the highest susceptibility to SARS-CoV-2 infection, potentially due to the presence of receptors other than ACE2, which may account for a significant portion of the observed susceptibility. Throughout the study, all tested compounds showed thermodynamically favorable and stable binding to the spike protein. Among the tested compounds, the anticoagulant nafamostat demonstrated the most favorable characteristics in terms of thermodynamics, kinetics, theoretical antiviral activity, and potential safety (toxicity) in relation to SARS-CoV-2 spike protein-mediated infections in the tested cell lines. This study provides mathematical and bioinformatic models that can aid in the identification of optimal cell lines for compound evaluation and detection, particularly in studies focused on repurposed drugs and their mechanisms of action. It is important to note that these observations should be experimentally validated, and this research is expected to inspire future quantitative experiments.

3.
J Comput Aided Mol Des ; 38(1): 9, 2024 Feb 14.
Article in English | MEDLINE | ID: mdl-38351144

ABSTRACT

Notwithstanding the wide adoption of the OECD principles (or best practices) for QSAR modeling, disparities between in silico predictions and experimental results are frequent, suggesting that model predictions are often too optimistic. Of these OECD principles, the applicability domain (AD) estimation has been recognized in several reports in the literature to be one of the most challenging, implying that the actual reliability measures of model predictions are often unreliable. Applying tree-based error analysis workflows on 5 QSAR models reported in the literature and available in the QsarDB repository, i.e., androgen receptor bioactivity (agonists, antagonists, and binders, respectively) and membrane permeability (highest membrane permeability and the intrinsic permeability), we demonstrate that predictions erroneously tagged as reliable (AD prediction errors) overwhelmingly correspond to instances in subspaces (cohorts) with the highest prediction error rates, highlighting the inhomogeneity of the AD space. In this sense, we call for more stringent AD analysis guidelines which require the incorporation of model error analysis schemes, to provide critical insight on the reliability of underlying AD algorithms. Additionally, any selected AD method should be rigorously validated to demonstrate its suitability for the model space over which it is applied. These steps will ultimately contribute to more accurate estimations of the reliability of model predictions. Finally, error analysis may also be useful in "rational" model refinement in that data expansion efforts and model retraining are focused on cohorts with the highest error rates.


Subjects
Algorithms , Quantitative Structure-Activity Relationship , Reproducibility of Results
4.
Astrobiology ; 23(10): 1083-1089, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37651215

ABSTRACT

A new chiral amplification mechanism based on a stochastic approach is proposed. The mechanism includes five different chemical species, an achiral substrate (A), two chiral forms (L, D), and two intermediary species (LA, DA). The process occurs within a small, semipermeable compartment that can be diffusively coupled with the outside environment. The study considers two alternative primary sources for chiral species within the compartment, one chemical and the other diffusive. As a remarkable fact, the chiral amplification process occurs due to stochastic fluctuations of an intermediary catalytic species (LA, DA) produced in situ, given the interaction of the chiral species with the achiral substrate. The net process includes two different steps: the synthesis of the catalyst (LA and DA) and the catalytic production of new chiral species from the substrate. Stochastic simulations show that proper parameterization can induce a robust chiral state within the compartment regardless of whether the system is open or closed. We also show how an increase in the non-catalytic production of chiral species tends to negatively impact the homochirality degree of the system. By its conception, the proposed mechanism suggests a deeper connection with how most biochemical processes occur in living beings, a fact that could open new avenues for studying this fascinating phenomenon.
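
A minimal sketch of how a stochastic simulation of this kind could be set up, using Gillespie's direct method for a closed compartment. The reaction list and every rate constant below are illustrative assumptions; the abstract does not give the actual reaction scheme, the parameters, or the diffusive coupling, which is omitted here.

```python
import random

# Hypothetical reaction scheme (assumed for illustration, not taken from the paper):
#   A      -> L        and  A      -> D         non-catalytic production (k0)
#   L + A  -> LA       and  D + A  -> DA        synthesis of the catalyst (k1)
#   LA + A -> LA + L   and  DA + A -> DA + D    catalytic production (k2)
def simulate(a0=200, k0=1e-3, k1=1e-3, k2=1e-1, seed=0, max_steps=100_000):
    """Gillespie direct method; returns the final species counts and time."""
    rng = random.Random(seed)
    s = {"A": a0, "L": 0, "D": 0, "LA": 0, "DA": 0}
    t = 0.0
    for _ in range(max_steps):
        # One (propensity, state change) pair per reaction channel.
        channels = [
            (k0 * s["A"], {"A": -1, "L": +1}),
            (k0 * s["A"], {"A": -1, "D": +1}),
            (k1 * s["L"] * s["A"], {"A": -1, "L": -1, "LA": +1}),
            (k1 * s["D"] * s["A"], {"A": -1, "D": -1, "DA": +1}),
            (k2 * s["LA"] * s["A"], {"A": -1, "L": +1}),
            (k2 * s["DA"] * s["A"], {"A": -1, "D": +1}),
        ]
        propensities = [p for p, _ in channels]
        total = sum(propensities)
        if total == 0:  # substrate exhausted, nothing can fire
            break
        t += rng.expovariate(total)  # waiting time to the next event
        _, change = rng.choices(channels, weights=propensities)[0]
        for name, delta in change.items():
            s[name] += delta
    return s, t


def enantiomeric_excess(s):
    """Signed excess of L-type over D-type species, in [-1, 1]."""
    l_total = s["L"] + s["LA"]
    d_total = s["D"] + s["DA"]
    if l_total + d_total == 0:
        return 0.0
    return (l_total - d_total) / (l_total + d_total)


if __name__ == "__main__":
    for seed in range(5):
        state, _ = simulate(seed=seed)
        print(seed, state, round(enantiomeric_excess(state), 3))
```

Running the sketch with several seeds gives a feel for how much the final excess varies between runs under these assumed parameters; it is not meant to reproduce the paper's results.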


Subjects
Stereoisomerism
5.
Bioinformatics ; 39(8)2023 Aug 01.
Article in English | MEDLINE | ID: mdl-37603724

ABSTRACT

MOTIVATION: Antimicrobial peptides (AMPs) are promising molecules to treat infectious diseases caused by multidrug-resistant pathogens, some types of cancer, and other conditions. Computer-aided strategies are efficient tools for the high-throughput screening of AMPs. RESULTS: This report highlights StarPep Toolbox, an open-source and user-friendly software tool to study the bioactive chemical space of AMPs using complex network-based representations, clustering, and similarity-searching models. The novelty of this research lies in the combination of network science and similarity-searching techniques, distinguishing it from conventional methods based on machine learning and other computational approaches. The network-based representation of the AMP chemical space presents promising opportunities for peptide drug repurposing, development, and optimization. This approach could serve as a baseline for the discovery of a new generation of therapeutic peptides. AVAILABILITY AND IMPLEMENTATION: All underlying code and installation files are accessible through GitHub (https://github.com/Grupo-Medicina-Molecular-y-Traslacional/StarPep) under the Apache 2.0 license.


Subjects
Peptides , Software , Cluster Analysis , Drug Repositioning , High-Throughput Screening Assays
6.
Antibiotics (Basel) ; 12(6)2023 Jun 05.
Article in English | MEDLINE | ID: mdl-37370330

ABSTRACT

The antimicrobial resistance process has been accelerated by the over-prescription and misuse of antibiotics [...].

7.
Int J Biol Macromol ; 244: 125113, 2023 Jul 31.
Article in English | MEDLINE | ID: mdl-37257544

ABSTRACT

The coupling of Cas9 and its inhibitor AcrIIC3, both from the bacterium Neisseria meningitidis (Nme), forms a homodimer of the (NmeCas9/AcrIIC3)2 type. This coupling was studied to assess the impact of their interaction with the crowders in the following environments: (1) homogeneous crowded, (2) heterogeneous, and (3) microheterogeneous cytoplasmic. For this, statistical thermodynamic models based on the scaled particle theory (SPT) were used, considering the attractive and repulsive protein-crowder contributions; the stability of the formation of spherocylindrical homodimers and the effects of changes in the size of spherical dimers were estimated. Studies based on models of dynamics, elastic networks, and statistical potentials applied to the formation of NmeCas9/AcrIIC3 complexes using PEG as the crowding agent support the predictions from SPT. Macromolecular crowding stabilizes the formation of the dimers, the effect being more significant when the attractive protein-crowder interactions are weaker and the crowders are smaller. The coupling is favored towards the formation of spherical and compact dimers due to crowding addition (excluded-volume effects), and the thermodynamic stability of the dimers is markedly dependent on the size of the crowders. These results support the experimental mechanistic proposal of inhibition of NmeCas9 mediated by AcrIIC3.


Subjects
Molecular Dynamics Simulation , Proteins , Macromolecular Substances , Polymers , Thermodynamics
8.
Antibiotics (Basel) ; 12(4)2023 Apr 13.
Article in English | MEDLINE | ID: mdl-37107109

ABSTRACT

Microbial biofilms cause several environmental and industrial issues, even affecting human health. Although they have long represented a threat due to their resistance to antibiotics, there are currently no approved antibiofilm agents for clinical treatments. The multi-functionality of antimicrobial peptides (AMPs), including their antibiofilm activity and their potential to target multiple microbes, has motivated the synthesis of AMPs and their relatives for developing antibiofilm agents for clinical purposes. Antibiofilm peptides (ABFPs) have been organized in databases that have allowed the building of prediction tools which have assisted in the discovery/design of new antibiofilm agents. However, the complex network approach has not yet been explored as an assistant tool for this aim. Herein, a kind of similarity network called the half-space proximal network (HSPN) is applied to represent/analyze the chemical space of ABFPs, aiming to identify privileged scaffolds for the development of next-generation antimicrobials that are able to target both planktonic and biofilm microbial forms. Such analyses also considered the metadata associated with the ABFPs, such as origin, other activities, targets, etc., in which the relationships were projected by multilayer networks called metadata networks (METNs). From the complex networks' mining, a reduced but informative set of 66 ABFPs was extracted, representing the original antibiofilm space. This subset contained the most central to atypical ABFPs, some of them having the desired properties for developing next-generation antimicrobials. Therefore, this subset is advisable for assisting the search for/design of both new antibiofilms and antimicrobial agents. The provided ABFP motifs list, discovered within the HSPN communities, is also useful for the same purpose.

9.
Biosystems ; 227-228: 104904, 2023 May.
Article in English | MEDLINE | ID: mdl-37088349

ABSTRACT

Inspired by coenzyme-like behavior, an alternative mechanism to induce homochirality within a small vesicle is proposed. The system includes six different chemical species: an achiral substrate A, the enantiomeric forms L and D, a coenzyme E and two intermediate catalytic forms LE and DE. Whereas the coenzyme and the intermediate catalytic forms are trapped within the vesicle, the substrate and the two enantiomeric forms are able to diffuse selectively across the vesicle boundary. Instead of using autocatalysis, the production of new enantiomers includes two different steps: the production of intermediate catalytic species (LE, DE) and the catalytic production of new enantiomers from the substrate. Using a suitable parameterization, we found that the chiral evolution of the system is highly dependent on the total amount of coenzyme within the vesicle, regardless of whether the surrounding membrane is permeable or not. However, the existence of large flows from the outside can destabilize the homochiral state inside the vesicle. In general, homochiral states tend to arise when the amount of coenzyme is quite low, a value that can vary according to the parameterization. On the other hand, the system tends to decrease the enantiomeric excess when the coenzyme levels are high enough. In general, the appearance of homochirality is conditioned by stochastic fluctuations in coenzyme levels within the vesicle, an effect that is gradually amplified throughout the entire process of enantiomer synthesis.


Subjects
Coenzymes , Stereoisomerism , Catalysis
10.
Mol Inform ; 42(6): e2200227, 2023 Jun.
Article in English | MEDLINE | ID: mdl-36894503

ABSTRACT

Predicting the likely biological activity (or property) of compounds is a fundamental and challenging task in the drug discovery process. Current computational methodologies aim to improve their predictive accuracies by using deep learning (DL) approaches. However, non-DL based approaches have proven to be the most suitable for small- and medium-sized chemical datasets. In this approach, an initial universe of molecular descriptors (MDs) is first calculated, then different feature selection algorithms are applied, and finally, one or several predictive models are built. Herein we demonstrate that this traditional approach may miss relevant information by assuming that the initial universe of MDs codifies all relevant aspects for the respective learning task. We argue that this limitation is mainly because of the constrained intervals of the parameters used in the algorithms that compute MDs, parameters that define the Descriptor Configuration Space (DCS). We propose to relax these constraints in an open DCS approach, so that a larger universe of MDs can be initially considered. We model the generation of MDs as a multicriteria optimization problem and tackle it with a variant of the standard genetic algorithm. As a novel component, the fitness function is computed by aggregating four criteria via the Choquet integral. Experimental results show that the proposed approach generates a meaningful DCS by improving on state-of-the-art approaches in most of the benchmarking chemical datasets considered.
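
To make the aggregation step concrete, the sketch below computes a discrete Choquet integral over four criterion scores. The criterion names, the scores, and the fuzzy measure are all hypothetical; the abstract does not say which four criteria or which measure the authors use. Scores are assumed to be non-negative.

```python
def choquet_integral(scores, measure):
    """Discrete Choquet integral of non-negative criterion scores.

    scores  -- dict mapping criterion name to its score
    measure -- function from a frozenset of criterion names to a weight in
               [0, 1]; it should be monotone and give 1 for the full set
    """
    order = sorted(scores, key=scores.get)  # criteria from lowest to highest score
    total, previous = 0.0, 0.0
    for i, name in enumerate(order):
        # Coalition of all criteria whose score is at least the current one.
        coalition = frozenset(order[i:])
        total += (scores[name] - previous) * measure(coalition)
        previous = scores[name]
    return total


# Hypothetical fitness criteria and values, for illustration only.
scores = {"c1": 0.2, "c2": 0.6, "c3": 0.9, "c4": 0.4}


def quadratic_measure(coalition):
    """Assumed cardinality-based measure that rewards criteria agreeing."""
    return (len(coalition) / len(scores)) ** 2


if __name__ == "__main__":
    print(choquet_integral(scores, quadratic_measure))
```

Unlike a weighted mean, the result depends on the weight given to each coalition of criteria, so interactions among criteria can be expressed. With an additive measure the integral reduces to a weighted mean; with a measure that is 1 on every non-empty set it reduces to the maximum.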


Subjects
Algorithms , Quantitative Structure-Activity Relationship , Drug Discovery , Benchmarking
11.
ACS Omega ; 7(50): 46012-46036, 2022 Dec 20.
Article in English | MEDLINE | ID: mdl-36570318

ABSTRACT

Antimicrobial peptides (AMPs) have emerged as promising compounds to treat a wide range of diseases. Their clinical potential resides in the wide range of mechanisms they can use for both killing microbes and modulating immune responses. However, the vastness of the AMPs' chemical space (AMPCS), represented by more than 10^65 unique sequences, has posed a major challenge for the discovery of new promising therapeutic peptides and for the identification of common structural motifs. Here, we introduce network science and a similarity searching approach to discover new promising AMPs, specifically antiparasitic peptides (APPs). We exploited the network-based representation of APPs' chemical space (APPCS) to retrieve valuable information by using three network types: chemical space (CSN), half-space proximal (HSPN), and metadata (METN). Some centrality measures were applied to identify in each network the most important and nonredundant peptides. Then, these central peptides were considered as queries (Qs) in group fusion similarity-based searches against a comprehensive collection of known AMPs, stored in the graph database StarPepDB, to propose new potential APPs. The performance of the resulting multiquery similarity-based search models (mQSSMs) was evaluated in five benchmarking data sets of APP/non-APPs. The predictions performed by the best mQSSM showed a strong-to-very-strong performance since their external Matthews correlation coefficient (MCC) values ranged from 0.834 to 0.965. Outstanding MCC values (>0.85) were attained by the mQSSM with 219 Qs from both networks CSN and HSPN with 0.5 as similarity threshold in external data sets. Then, the performance of our best mQSSM was compared with the APPs prediction servers AMPDiscover and AMPFun. The proposed model showed its relevance by outperforming state-of-the-art machine learning models to predict APPs.
After applying the best mQSSM and additional filters on the non-APP space from StarPepDB, 95 AMPs were repurposed as potential APP hits. Due to the high sequence diversity of these peptides, different computational approaches were applied to identify relevant motifs for searching and designing new APPs. Lastly, we identified 11 promising APP lead candidates by using our best mQSSMs together with diversity-based network analyses, and 24 web servers for activity/toxicity and drug-like properties. These results support that network-based similarity searches can be an effective and reliable strategy to identify APPs. The proposed models and pipeline are freely available through the StarPep toolbox software at http://mobiosd-hub.com/starpep.

12.
Sci Rep ; 12(1): 19969, 2022 Nov 19.
Article in English | MEDLINE | ID: mdl-36402831

ABSTRACT

Primary hyperoxaluria type 1 (PHT1) treatment is mainly focused on inhibiting the enzyme glycolate oxidase, which plays a pivotal role in the production of glyoxylate, which undergoes oxidation to produce oxalate. When the renal secretion capacity is exceeded, calcium oxalate forms stones that accumulate in the kidneys. In this respect, detailed QSAR analysis, molecular docking, and dynamics simulations of a series of inhibitors containing glycolic, glyoxylic, and salicylic acid groups have been performed employing different regression machine learning techniques. Three robust models with fewer than 9 descriptors, assessed by tenfold cross-validation (Q2 CV) and external validation (Q2 EXT), were found, i.e., MLR1 (Q2 CV = 0.893, Q2 EXT = 0.897), RF1 (Q2 CV = 0.889, Q2 EXT = 0.907), and IBK1 (Q2 CV = 0.891, Q2 EXT = 0.907). An ensemble model was built by averaging the predicted pIC50 of the three models, obtaining a Q2 EXT = 0.933. Physicochemical properties such as charge, electronegativity, hardness, softness, van der Waals volume, and polarizability were considered as attributes to build the models. To get more insight into the potential biological activity of the compounds studied herein, docking and dynamic analyses were carried out, finding that the hydrophobic and polar residues show important interactions with the ligands. A screening of the DrugBank database V.5.1.7 was performed, leading to the proposal of seven commercial drugs within the applicability domain of the models that can be suggested as possible PHT1 treatments.
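
The ensemble step and the external validation statistic can be sketched as follows. This is a minimal illustration: the pIC50 numbers are invented, and the Q2 EXT formula used is one common definition (residual sum of squares on the external set relative to deviations from the training-set mean); the abstract does not state which definition the authors applied.

```python
def ensemble_mean(*model_predictions):
    """Average the predictions of several models, compound by compound."""
    return [sum(values) / len(values) for values in zip(*model_predictions)]


def q2_ext(y_true, y_pred, y_train_mean):
    """External Q2: 1 - PRESS / SS, with SS taken around the training-set mean."""
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss = sum((y - y_train_mean) ** 2 for y in y_true)
    return 1.0 - press / ss


if __name__ == "__main__":
    # Invented pIC50 values for four external compounds (illustration only).
    observed = [5.1, 6.3, 7.0, 4.8]
    mlr = [5.4, 6.0, 6.8, 5.0]
    rf = [5.0, 6.5, 7.3, 4.5]
    ibk = [5.2, 6.1, 6.9, 5.1]
    train_mean = 5.9  # assumed mean pIC50 of the training set

    averaged = ensemble_mean(mlr, rf, ibk)
    for name, pred in [("MLR", mlr), ("RF", rf), ("IBK", ibk), ("ensemble", averaged)]:
        print(name, round(q2_ext(observed, pred, train_mean), 3))
```

Averaging tends to help when the individual models make partly independent errors, which is a plausible reading of why the reported ensemble Q2 EXT exceeds that of each single model.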


Subjects
Molecular Dynamics Simulation , Quantitative Structure-Activity Relationship , Molecular Docking Simulation , Alcohol Oxidoreductases
13.
Front Chem ; 10: 959143, 2022.
Article in English | MEDLINE | ID: mdl-36277354

ABSTRACT

This study introduces a set of fuzzy spherically truncated three-dimensional (3D) multi-linear descriptors for proteins. These indices codify geometric structural information from kth spherically truncated spatial-(dis)similarity two-tuple and three-tuple tensors. The coefficients of these truncated tensors are calculated by applying a smoothing value to the 3D structural encoding based on the relationships between two and three amino acids of a protein embedded into a sphere. Assuming that the geometrical center of the protein coincides with the center of the sphere, the distance between each amino acid involved in any specific interaction and the geometrical center of the protein can be computed. Then, the fuzzy membership degree of each amino acid in a spherical region of interest is computed by fuzzy membership functions (FMFs). The truncation value is finally a combination of the membership degrees of the interacting amino acids, obtained by applying the arithmetic mean as the fusion rule. Several fuzzy membership functions with diverse biases on the calculation of amino acid memberships (e.g., Z-shaped (close to the center), PI-shaped (middle region), and A-Gaussian (far from the center)) were considered as well as traditional truncation functions (e.g., Switching). Such truncation functions were comparatively evaluated by exploring: 1) the frequency of membership degrees, 2) the variability and orthogonality analyses among them based on the Shannon entropy and principal component methods, respectively, and 3) the performance in alignment-free prediction of protein folding rates and structural classes. These analyses revealed the singularity of the proposed fuzzy spherically truncated MDs with respect to the classical (non-truncated) ones and with respect to the MDs truncated with traditional functions.
They also showed an improved prediction power by attaining an external correlation coefficient of 95.82% in the folding rate modelling and an accuracy of 100% in distinguishing structural protein classes. These outcomes are better than the ones attained by existing approaches, justifying the theoretical contribution of this report. Thus, the fuzzy spherically truncated-based protein descriptors from MuLiMs-MCoMPAs (http://tomocomd.com/mulims-mcompas) are promising alignment-free predictors for modeling protein functions and properties.
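
A minimal sketch of the truncation idea described in this abstract: each amino acid gets a fuzzy membership degree from its distance to the protein's geometrical center, and the degrees of the interacting residues are fused by the arithmetic mean. The Z-shaped function below follows the standard textbook definition; the breakpoints `a` and `b` and the distances are hypothetical, since the abstract does not give the parameterization.

```python
def z_shaped(x, a, b):
    """Standard Z-shaped membership: 1 up to a, smooth decay, 0 from b on."""
    if x <= a:
        return 1.0
    if x >= b:
        return 0.0
    mid = (a + b) / 2
    if x <= mid:
        return 1.0 - 2.0 * ((x - a) / (b - a)) ** 2
    return 2.0 * ((x - b) / (b - a)) ** 2


def truncation_value(distances, a, b):
    """Arithmetic-mean fusion of the membership degrees of interacting residues.

    distances -- distance of each interacting amino acid to the protein's
                 geometrical center (same length unit as a and b)
    """
    degrees = [z_shaped(d, a, b) for d in distances]
    return sum(degrees) / len(degrees)


if __name__ == "__main__":
    # Hypothetical pair of residues at 3 and 9 length units from the center,
    # with assumed breakpoints a = 2 and b = 10.
    print(truncation_value([3.0, 9.0], a=2.0, b=10.0))
```

A Z-shaped function weights interactions near the center most heavily; swapping in a function that peaks in the middle region or far from the center would shift the emphasis accordingly.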

14.
Antibiotics (Basel) ; 11(7)2022 Jul 13.
Article in English | MEDLINE | ID: mdl-35884190

ABSTRACT

In the last two decades many reports have addressed the application of artificial intelligence (AI) in the search and design of antimicrobial peptides (AMPs). AI has been represented by machine learning (ML) algorithms that use sequence-based features for the discovery of new peptidic scaffolds with promising biological activity. From the AI perspective, evolutionary algorithms have also been applied to the rational generation of peptide libraries aimed at the optimization/design of AMPs. However, the literature has scarcely addressed other emerging non-conventional in silico approaches for the search/design of such bioactive peptides. Thus, the first motivation here is to bring up some non-standard peptide features that have been used to build classical ML predictive models. Secondly, it is valuable to highlight emerging ML algorithms and alternative computational tools to predict/design AMPs as well as to explore their chemical space. Another point worthy of mention is the recent application of evolutionary algorithms that actually simulate sequence evolution to both the generation of diversity-oriented peptide libraries and the optimization of hit peptides. Last but not least, we include here some new considerations in proteogenomic analyses currently incorporated into the computational workflow for unravelling AMPs in natural sources.

15.
Antibiotics (Basel) ; 11(3)2022 Mar 17.
Article in English | MEDLINE | ID: mdl-35326864

ABSTRACT

Peptide-based drugs are promising anticancer candidates due to their biocompatibility and low toxicity. In particular, tumor-homing peptides (THPs) have the ability to bind specifically to cancer cell receptors and tumor vasculature. Despite their potential to develop antitumor drugs, there are few available prediction tools to assist the discovery of new THPs. Two webservers based on machine learning models are currently active, the TumorHPD and the THPep, and more recently the SCMTHP. Herein, a novel method based on network science and similarity searching implemented in the starPep toolbox is presented for THP discovery. The approach relies on exploring the structural space of THPs with Chemical Space Networks (CSNs) and on applying centrality measures to identify the most relevant and non-redundant THP sequences within the CSN. Such THPs were considered as queries (Qs) for multi-query similarity searches that apply a group fusion (MAX-SIM rule) model. The resulting multi-query similarity searching models (SSMs) were validated with three benchmarking datasets of THPs/non-THPs. The predictions achieved accuracies that ranged from 92.64 to 99.18% and Matthews Correlation Coefficients between 0.894 and 0.98, outperforming state-of-the-art predictors. The best model was applied to repurpose AMPs from the starPep database as THPs, which were subsequently optimized for the TH activity. Finally, 54 promising THP leads were discovered, and their sequences were analyzed to identify novel motifs. These results demonstrate the potential of CSNs and multi-query similarity searching for the rapid and accurate identification of THPs.
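
The group fusion step with the MAX-SIM rule can be sketched as below: a candidate is scored by its highest similarity to any query and is predicted active when that score reaches a threshold. The similarity function is a stand-in from Python's standard library, the sequences are made-up strings, and the threshold is arbitrary; none of these reflects the starPep toolbox's actual implementation.

```python
from difflib import SequenceMatcher


def similarity(seq_a, seq_b):
    """Stand-in sequence similarity in [0, 1] (assumption, not the toolbox's metric)."""
    return SequenceMatcher(None, seq_a, seq_b).ratio()


def max_sim_score(candidate, queries):
    """MAX-SIM group fusion: keep the best similarity over all queries."""
    return max(similarity(candidate, query) for query in queries)


def predict(candidates, queries, threshold=0.5):
    """Label each candidate as a hit when its fused score reaches the threshold."""
    return {c: max_sim_score(c, queries) >= threshold for c in candidates}


if __name__ == "__main__":
    # Made-up query and candidate sequences, for illustration only.
    queries = ["ACDKRGL", "GLKKACD", "RRGDLAC"]
    candidates = ["ACDKRGA", "WWWWWW", "RRGDLAC"]
    for peptide, is_hit in predict(candidates, queries).items():
        print(peptide, round(max_sim_score(peptide, queries), 3), is_hit)
```

Because only the best-matching query counts, adding diverse, non-redundant queries widens coverage without diluting the score, which fits the abstract's emphasis on selecting central and non-redundant sequences as queries.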

16.
Mol Divers ; 26(3): 1383-1397, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34216326

ABSTRACT

With the advancement of combinatorial chemistry and big data, drug repositioning has boomed. In this sense, machine learning and artificial intelligence techniques offer a priori information to identify the most promising candidates. In this study, we combine QSAR and docking methodologies to identify compounds with potential inhibitory activity of vasoactive metalloproteases for the treatment of cardiovascular diseases. To develop this study, we used a database of 191 thermolysin inhibitor compounds, which is the largest as far as we know. First, we use Dragon's molecular descriptors (0-3D) to develop classification models using Bayesian networks (Naive Bayes) and artificial neural networks (Multilayer Perceptron). The obtained models are used for virtual screening of small molecules in the international DrugBank database. Second, docking experiments are carried out for all three enzymes using the Autodock Vina program, to identify possible interactions with the active site of human metalloproteases. As a result, high-performance artificial intelligence QSAR models are obtained for training and prediction sets. These allowed the identification of 18 compounds with potential inhibitory activity and an adequate oral bioavailability profile, which were evaluated using docking. Four of them showed high binding energies for the three enzymes, and we propose them as potential dual ACE/NEP inhibitors for the control of blood pressure. In summary, the in silico strategies used here constitute an important tool for the early identification of new antihypertensive drug candidates, with substantial savings in time and money.


Subjects
Artificial Intelligence , Machine Learning , Bayes Theorem , Drug Repositioning , Humans , Metalloproteases , Molecular Docking Simulation , Quantitative Structure-Activity Relationship
17.
Sci Rep ; 10(1): 18074, 2020 Oct 22.
Article in English | MEDLINE | ID: mdl-33093586

ABSTRACT

The increasing interest in bioactive peptides with therapeutic potentials has been reflected in a large variety of biological databases published over the last years. However, the knowledge discovery process from these heterogeneous data sources is a nontrivial task, becoming the essence of our research endeavor. Therefore, we devise a unified data model based on molecular similarity networks for representing a chemical reference space of bioactive peptides, having an implicit knowledge that is currently not explicitly accessed in existing biological databases. Indeed, our main contribution is a novel workflow for the automatic construction of such similarity networks, enabling visual graph mining techniques to uncover new insights from the "ocean" of known bioactive peptides. The workflow presented here relies on the following sequential steps: (i) calculation of molecular descriptors by applying statistical and aggregation operators on amino acid property vectors; (ii) a two-stage unsupervised feature selection method to identify an optimized subset of descriptors using the concepts of entropy and mutual information; (iii) generation of sparse networks where nodes represent bioactive peptides, and edges between two nodes denote their pairwise similarity/distance relationships in the defined descriptor space; and (iv) exploratory analysis using visual inspection in combination with clustering and network science techniques. For practical purposes, the proposed workflow has been implemented in our visual analytics software tool ( http://mobiosd-hub.com/starpep/ ), to assist researchers in extracting useful information from an integrated collection of 45120 bioactive peptides, which is one of the largest and most diverse data in its field. Finally, we illustrate the applicability of the proposed workflow for discovering central nodes in molecular similarity networks that may represent a biologically relevant chemical space known to date.
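
Steps (i) and (iii) of the workflow can be illustrated with a small sketch: descriptors are obtained by applying statistical operators to an amino acid property vector, and an edge joins two peptides whose similarity in descriptor space reaches a threshold. The choice of property (Kyte-Doolittle hydropathy), the four operators, the similarity formula, and the threshold are all assumptions made for illustration; the feature selection of step (ii) is omitted.

```python
import math
from collections import Counter
from statistics import mean, pstdev

# Kyte-Doolittle hydropathy index per amino acid (one-letter codes).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def descriptors(sequence):
    """Aggregate the property vector of a peptide into a fixed-length tuple."""
    values = [HYDROPATHY[aa] for aa in sequence]
    return (mean(values), pstdev(values), max(values), min(values))


def similarity(desc_a, desc_b):
    """Map Euclidean distance in descriptor space to a similarity in (0, 1]."""
    return 1.0 / (1.0 + math.dist(desc_a, desc_b))


def build_network(peptides, threshold=0.5):
    """Return the edges (pairs of peptides) whose similarity reaches the threshold."""
    desc = {p: descriptors(p) for p in peptides}
    edges = set()
    for i, p in enumerate(peptides):
        for q in peptides[i + 1:]:
            if similarity(desc[p], desc[q]) >= threshold:
                edges.add((p, q))
    return edges


def degree_centrality(edges):
    """Count the edges incident to each node, a simple centrality measure."""
    counts = Counter()
    for p, q in edges:
        counts[p] += 1
        counts[q] += 1
    return counts


if __name__ == "__main__":
    peptides = ["LLKK", "KKLL", "GGGG", "LKLK"]  # made-up sequences
    edges = build_network(peptides)
    print(edges)
    print(degree_centrality(edges).most_common())
```

Raising the threshold makes the network sparser, which is what keeps a graph over tens of thousands of peptides tractable for visual inspection and centrality analysis.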


Subjects
Algorithms , Antineoplastic Agents/chemistry , Computational Biology/methods , Computer Graphics , Models, Chemical , Peptide Fragments/chemistry , Unsupervised Machine Learning , Computer Simulation , Databases, Factual , Humans , Software
18.
Chem Res Toxicol ; 33(7): 1855-1873, 2020 Jul 20.
Article in English | MEDLINE | ID: mdl-32406679

ABSTRACT

Drug-induced liver injury (DILI) is a key safety issue in the drug discovery pipeline and a regulatory concern. Thus, many in silico tools have been proposed to improve the hepatotoxicity prediction of organic-type chemicals. Here, classifiers for the prediction of DILI were developed by using QuBiLS-MAS 0-2.5D molecular descriptors and shallow machine learning techniques, on a training set composed of 1075 molecules. The best ensemble model built, E13, was obtained with good statistical parameters for the learning series, namely, the following: accuracy = 0.840, sensitivity = 0.890, specificity = 0.761, Matthews correlation coefficient = 0.660, and area under the ROC curve = 0.904. The model was also satisfactorily evaluated with a Y-scrambling test, repeated k-fold cross-validation, and repeated k-holdout validation. In addition, an exhaustive external validation was also carried out by using two test sets and five external test sets, with an average accuracy value equal to 0.854 (±0.062) and a coverage equal to 98.4% according to its applicability domain. A statistical comparison of the performance of the E13 model, with regard to results and tools (e.g., Padel DDPredictor Software, Deep Learning DILIserver, and Vslead) reported in the literature, was also performed. In general, E13 presented the best global performance in all experiments. The sum of the ranking differences procedure provided a very similar grouping pattern to that of the M-ANOVA statistical analysis, where E13 was identified as the best model for DILI predictions. A noncommercial and fully cross-platform software for the DILI prediction was also developed, which is freely available at http://tomocomd.com/apps/ptoxra.
This software was used for the screening of seven data sets, containing natural products, leads, toxic materials, and FDA approved drugs, to assess the usefulness of the QSAR models in the DILI labeling of organic substances; it was found that 50-92% of the evaluated molecules are positive-DILI compounds. All in all, it can be stated that the E13 model is a relevant method for the prediction of DILI risk in humans, as it shows the best results among all of the methods analyzed.
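
For readers who want to relate the statistics quoted in this abstract to a confusion matrix, the sketch below computes accuracy, sensitivity, specificity, and the Matthews correlation coefficient from the four cell counts. The counts in the example are invented and are not the paper's data.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Binary classification statistics from confusion-matrix counts.

    tp/fn -- positives (e.g. DILI compounds) predicted correctly/incorrectly
    tn/fp -- negatives predicted correctly/incorrectly
    """
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # fraction of positives recovered
    specificity = tn / (tn + fp)  # fraction of negatives recovered
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is set to 0 when a row or column of the matrix is empty.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Invented counts, for illustration only.
    for name, value in classification_metrics(tp=40, tn=30, fp=10, fn=20).items():
        print(f"{name}: {value:.3f}")
```

MCC uses all four cells of the matrix, so it stays informative when the classes are imbalanced, whereas accuracy alone can look good for a model that mostly predicts the majority class.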


Subjects
Chemical and Drug Induced Liver Injury , Models, Biological , Drug Discovery , Machine Learning , Quantitative Structure-Activity Relationship , Software
19.
J Comput Chem ; 41(12): 1209-1227, 2020 May 05.
Article in English | MEDLINE | ID: mdl-32058625

ABSTRACT

This article reports the advances made to the distributed, multi-core, and fully cross-platform QuBiLS-MIDAS software v2.0 (http://tomocomd.com/qubils-midas) since the v1.0 release. The QuBiLS-MIDAS software is the only one that computes atom-pair and alignment-free geometrical molecular descriptors (3D-MDs) from several distance metrics other than the Euclidean distance, as well as alignment-free 3D-MDs that codify structural information regarding the relations among three and four atoms of a molecule. The most recent features added to v2.0 are (a) the calculation of atomic weightings from indices based on the vertex-degree invariant (e.g., Alikhanidi index); (b) the consideration of central chirality during the molecular encoding; (c) the use of measures based on clustering methods and statistical functions to codify structural information among more than two atoms; (d) the use of a novel method based on fuzzy membership functions to spherically truncate inter-atomic relations; and (e) the use of weighted and fuzzy aggregation operators to compute global 3D-MDs according to the importance and/or interrelation of the atoms of a molecule during the molecular encoding. Moreover, a novel module to compute QuBiLS-MIDAS 3D-MDs from their headings was developed. This module can be used either through the graphical user interface or through the software library. By using the library, both the predictive models built with the QuBiLS-MIDAS 3D-MDs and the calculation of the 3D-MDs themselves can be embedded in other tools. A set of predefined QuBiLS-MIDAS 3D-MDs with high information content and low redundancy on a set of 20,469 compounds is also provided for use in further cheminformatics tasks. This set of predefined 3D-MDs performed better than the entire set of Dragon (v5.5) and PaDEL 0D-to-3D MDs in variability studies, whereas a linear independence study showed that these QuBiLS-MIDAS 3D-MDs codify chemical information orthogonal to the Dragon 0D-to-3D MDs. This set of predefined 3D-MDs will be updated periodically as new results are achieved. In general, this report highlights our continued efforts to provide a better tool for a more suitable characterization of compounds and, in this way, to contribute to better outcomes in future applications.
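To make concrete what computing atom-pair relations "from several distance metrics other than the Euclidean distance" can mean, the sketch below evaluates pairwise inter-atomic distances under the Minkowski family (Manhattan, Euclidean, Chebyshev). It is an illustration only and is not the QuBiLS-MIDAS implementation or API. The function names and coordinates are hypothetical, and the abstract does not state which metrics the software actually offers.

```python
from itertools import combinations


def minkowski(a, b, p):
    """Minkowski distance of order p between two 3D points.

    p=1 is Manhattan, p=2 is Euclidean, p=float('inf') is Chebyshev.
    """
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if p == float("inf"):
        return max(diffs)
    return sum(d ** p for d in diffs) ** (1.0 / p)


def pairwise_distances(coords, p):
    """Distances for every unordered atom pair, keyed by atom indices."""
    return {
        (i, j): minkowski(coords[i], coords[j], p)
        for i, j in combinations(range(len(coords)), 2)
    }


# Hypothetical atom coordinates (arbitrary units), for illustration only.
atoms = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0), (1.0, 0.0, 0.0)]
euclidean = pairwise_distances(atoms, p=2)
manhattan = pairwise_distances(atoms, p=1)
chebyshev = pairwise_distances(atoms, p=float("inf"))
```

Because none of these metrics depends on the orientation of a reference frame relative to another molecule, descriptors built from them need no prior structural alignment.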

20.
J Theor Biol ; 485: 110039, 2020 Jan 21.
Article in English | MEDLINE | ID: mdl-31589877

ABSTRACT

Novel 3D protein descriptors based on bilinear, quadratic, and linear algebraic maps in R^n are proposed. The latter employs the kth 2-tuple (dis)similarity matrix to codify information related to covalent and non-covalent interactions in these biopolymers. The calculation of the inter-amino acid distances is generalized by using several dissimilarity coefficients, to which normalization procedures based on the simple stochastic and mutual probability schemes are applied. A new local-fragment approach based on amino acid types and amino acid groups is proposed to characterize regions of interest in proteins. Topological and geometric macromolecular cutoffs are defined using local and total indices to highlight non-covalent interactions between the side chains of the amino acids. Moreover, the calculation of local and total indices is generalized through a LEGO approach that uses several aggregation operators. Collinearity and variability analyses are performed to evaluate every generalizing component applied to the definition of these novel indices. These experiments aim to reduce the number of molecular descriptors (MDs) used to build prediction models. The predictive power of the proposed indices was evaluated using two benchmark datasets: folding rate and secondary structural classification of proteins. The proposed MDs are modeled using Multiple Linear Regression (MLR) and Support Vector Machine (SVM), respectively. The best regression model developed for the folding rate of proteins yields a cross-validation coefficient of 0.875 (Test Set), and the best model developed for secondary structural classification correctly classified 98% of instances (Test Set). These statistical parameters are superior to those obtained with existing MDs reported in the literature. Overall, the new theoretical generalization enhanced the information extracted into the MDs, allowing a better correlation between these two benchmark datasets and the proposed indices. The optimal theoretical configurations defined for the calculation of these MDs favor low collinearity and low information redundancy among them. These theoretical configurations and the software are available at http://tomocomd.com/mulims-mcompas.


Subjects
Proteins , Quantitative Structure-Activity Relationship , Software , Amino Acids , Linear Models